
Conversation

@ffranr
Contributor

@ffranr ffranr commented Dec 2, 2025

Closes #1181


  • Treat integrated sub-servers as fatal to startup and return an error if any fail to start.
  • Propagate integrated sub-server startup errors to LiT so it stops launching and records the failure status.
  • Update docs to reflect integrated sub-servers are now critical to startup.

@ffranr ffranr self-assigned this Dec 2, 2025
@gemini-code-assist

Summary of Changes

Hello @ffranr, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the reliability of the LiT startup sequence by elevating the status of integrated sub-servers to critical. Previously, LiT could continue operating even if some of its integrated sub-servers failed to launch. Now, any failure in an integrated sub-server's boot process will cause LiT itself to cease its startup, ensuring that the system only reaches an operational state when all essential integrated components are functioning correctly. This change provides a more robust and predictable initialization experience.

Highlights

  • Critical Integrated Sub-Server Startup: Integrated sub-servers are now treated as critical components for LiT (Lightning Terminal) startup. If any integrated sub-server fails to initialize, LiT will now abort its launch process.
  • Error Propagation: The StartIntegratedServers function in the sub-server manager has been updated to return an error if any integrated sub-server fails to start, allowing LiT to properly handle and propagate these failures.
  • LiT Startup Robustness: LiT's main startup routine (terminal.go) now explicitly checks for errors returned by StartIntegratedServers and will set an errored status and halt its own startup if such failures occur.
  • Documentation Update: Comments within the subservers/interface.go file have been updated to reflect the new critical nature of integrated sub-servers during LiT startup.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request modifies the startup logic to treat integrated sub-server startup failures as fatal. The changes correctly propagate errors from StartIntegratedServers up to the main start function, causing LiT to fail on startup as intended. The documentation has also been updated to reflect this new behavior. The implementation is sound, but I have one suggestion to improve the consistency of error handling.

Contributor

@ViktorT-11 ViktorT-11 left a comment


Thanks for this @ffranr 🙏!

In addition to the feedback I've commented below, this new behaviour definitely needs itest coverage.

I think we actually started working on this at the same time, and I have a local branch with draft code implementing this plus itest coverage. If you want, I can clean that up and push it so that you can cherry-pick it, to make things simpler for you. Let me know if that'd be helpful :).

@lightninglabs-deploy

@ffranr, remember to re-request review from reviewers when ready

Member

@ellemouton ellemouton left a comment


Haven't actually looked at the diff here yet, just want to make a note in case: we should make sure that the daemon still runs, i.e. that the status server still gets served.

@ffranr ffranr force-pushed the wip/fail-startup-on-subserver-err branch 4 times, most recently from 1772099 to 0091958 on December 16, 2025 at 17:11
* Treat integrated sub-servers as fatal to startup and return an error
if any fail to start.
* Propagate integrated sub-server startup errors to LiT so it stops
launching and records the failure status.
* Update docs to reflect integrated sub-servers are now critical to
startup.
- Ensure critical integrated sub-servers initialize first.
- Introduce alphabetical sorting for consistent order across startup
  runs.
- Introduce tests for critical and non-critical sub-server startup
  behavior.
- Ensure failures in critical servers stop startup, while non-critical
  failures are tolerated.
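The ordering described in the commit message (critical sub-servers first, alphabetical within each group for consistent startup order) could look something like this sketch; the critical set and function names are assumptions for illustration, not the PR's actual identifiers:

```go
package main

import (
	"fmt"
	"sort"
)

// critical is an assumed stand-in for the PR's
// criticalIntegratedSubServers set.
var critical = map[string]bool{
	"taproot-assets": true,
}

// orderSubServers sorts critical integrated sub-servers first, and
// alphabetically within each group, so that startup order is the
// same on every run.
func orderSubServers(names []string) []string {
	ordered := append([]string(nil), names...)
	sort.SliceStable(ordered, func(i, j int) bool {
		ci, cj := critical[ordered[i]], critical[ordered[j]]
		if ci != cj {
			// Critical servers sort before non-critical ones.
			return ci
		}
		return ordered[i] < ordered[j]
	})
	return ordered
}

func main() {
	names := []string{"pool", "loop", "taproot-assets", "faraday"}
	fmt.Println(orderSubServers(names))
	// [taproot-assets faraday loop pool]
}
```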
@ffranr ffranr force-pushed the wip/fail-startup-on-subserver-err branch from 0091958 to 381d0b1 on December 18, 2025 at 00:17
Comment on lines +137 to +140
if criticalIntegratedSubServers.Contains(ss.Name()) {
	return fmt.Errorf("%s: %v", ss.Name(), err)
}

Member


But I think this will then result in a complete shutdown, which I don't think we want? We want the status server to remain running.

Contributor

@ViktorT-11 ViktorT-11 left a comment


Thanks for the updates 🚀! Leaving some additional feedback on top of @ellemouton's.


// testCriticalTapStartupFailure ensures LiT exits quickly when a critical
// integrated sub-server (tapd) fails to start during boot.
func testCriticalTapStartupFailure(ctx context.Context, net *NetworkHarness,
Contributor


Thanks for this 🙏! Similar to the unit tests, we should also have an itest which covers a startup when a non-critical sub-server errors during startup.

if err != nil {
	s.statusServer.SetErrored(ss.Name(), err.Error())

	if criticalIntegratedSubServers.Contains(ss.Name()) {
Contributor


Thanks 🙏!

Note that the commit message is outdated and does not cover the "criticalIntegratedSubServers" part.

Comment on lines +412 to +425
if client, err := g.basicLNDClient(); err == nil {
	stopCtx, cancel := context.WithTimeout(
		ctx, 5*time.Second,
	)
	defer cancel()

	_, err := client.StopDaemon(
		stopCtx, &lnrpc.StopRequest{},
	)
	if err != nil {
		log.Warnf("Error stopping lnd after failed "+
			"start: %v", err)
	}
}
Contributor


This will now attempt to shut down lnd for all types of startup errors, not just the "critical sub-server" error.

I'm not sure that this is exactly what we want, for two main reasons:

  1. This will apply in remote mode as well, and attempt to shut down the remote lnd node for any litd-related startup error.
  2. Even in integrated mode, it's not certain that we've gotten to executing the code that sets basicLNDClient by the time g.start errors, despite having started lnd. That means lnd will not be shut down in that scenario, which leads to quite unpredictable behaviour: lnd will sometimes be shut down when g.start errors, and sometimes not.

I therefore think we should only execute this if we specifically error during the startup of a critical litd sub-server, to keep the behaviour predictable and avoid affecting remote lnd nodes. You may also want to guard the shutdown request by requiring that litd is not running with a "remote" lnd node, even though the current critical sub-server startup error logic cannot occur when lnd is in remote mode.

err)
}

return startErr
Contributor


Similar to what @ellemouton has already commented, this will now shut down litd before shutdownInterceptor.ShutdownChannel() has been triggered, which we would like to avoid.

Comment on lines +427 to +430
if err := g.shutdownSubServers(); err != nil {
	log.Errorf("Error shutting down after failed start: %v",
		err)
}
Contributor


We should only do this after the shutdownInterceptor.ShutdownChannel() has been triggered, similar to the current logic below :)



Development

Successfully merging this pull request may close these issues.

Fail litd startup on tapd startup error when taproot-assets-mode=enable

5 participants